AITopics | digital humanity association

digital humanity association

Izindaba-Tindzaba: Machine learning news categorisation for Long and Short Text for isiZulu and Siswati

Madodonga, Andani; Marivate, Vukosi; Adendorff, Matthew

arXiv.org Artificial Intelligence, Jun-12-2023

Local (native) South African languages are classified as low-resource languages. It is therefore essential to build resources for these languages so that they can benefit from advances in natural language processing. This work creates annotated news datasets for isiZulu and Siswati for the task of news topic classification, and presents findings from baseline classification models. Because data for these languages is scarce, the datasets were augmented and oversampled to increase their size and to overcome class imbalance. Four classification models were used: Logistic Regression, Naive Bayes, XGBoost and LSTM. These models were trained on three different word representations: Bag-of-Words, TF-IDF and Word2vec. The results showed that XGBoost, Logistic Regression and LSTM trained on Word2vec performed better than the other combinations.
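As a rough illustration of the oversampling step mentioned in this abstract, the sketch below balances classes by duplicating randomly chosen minority-class examples and builds a simple bag-of-words count. It is a minimal sketch, not the authors' pipeline: the function names, the toy texts and the labels are all assumptions made for illustration, and the abstract does not say which oversampling method was used.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def bag_of_words(text):
    """Map a text to lowercase token counts (whitespace tokenisation)."""
    return Counter(text.lower().split())


# Hypothetical toy data; the labels are illustrative, not the paper's categories.
texts = [
    "story one about sport",
    "story two about sport",
    "story three about sport",
    "story about politics",
]
labels = ["sport", "sport", "sport", "politics"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
class_counts = Counter(balanced_labels)
```

Oversampling of this kind is normally applied to the training split only, so that duplicated examples do not leak into evaluation data.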

artificial intelligence, dataset, machine learning, (15 more...)

2306.07426

Country:

Africa > Southern Africa (0.05)
Europe > Germany > Saxony > Leipzig (0.05)
Asia > Middle East > Republic of Türkiye (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.52)

Textual Augmentation Techniques Applied to Low Resource Machine Translation: Case of Swahili

Gitau, Catherine; Marivate, Vukosi

arXiv.org Artificial Intelligence, Jun-12-2023

In this work we investigate the impact of applying textual data augmentation techniques to low-resource machine translation. There has been recent interest in approaches for training systems for languages with limited resources, and one popular approach is data augmentation, which aims to increase the quantity of data available to train the system. In machine translation, the majority of language pairs around the world are considered low resource because little parallel data is available for them, and the quality of neural machine translation (NMT) systems depends heavily on the availability of sizable parallel corpora. We study and apply three simple data augmentation techniques popularly used in text classification tasks (synonym replacement, random insertion and contextual data augmentation) and compare their performance with a baseline neural machine translation system on English-Swahili (En-Sw) datasets. We present results in BLEU, ChrF and Meteor scores. Overall, the contextual data augmentation technique shows some improvements in both the $EN \rightarrow SW$ and $SW \rightarrow EN$ directions. We see potential to use these methods in neural machine translation once more extensive experiments are done with diverse datasets.
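Two of the techniques named in this abstract, synonym replacement and random insertion, can be sketched at the token level as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the `SYNONYMS` table, the function names and the example sentence are hypothetical, and the abstract does not specify how augmented source sentences are paired with target sentences. Contextual data augmentation relies on a language model and is not sketched here.

```python
import random

# Hypothetical toy synonym table; a real system would draw on a lexical resource.
SYNONYMS = {
    "big": ["large", "huge"],
    "house": ["home"],
    "quick": ["fast"],
}


def synonym_replacement(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS with a random synonym."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if tok in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(tokens, n, rng):
    """Insert, n times, a synonym of a randomly chosen token at a random position."""
    out = list(tokens)
    for _ in range(n):
        candidates = [tok for tok in out if tok in SYNONYMS]
        if not candidates:
            break  # nothing in the sentence has a known synonym
        synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


rng = random.Random(0)
source = "the big house".split()
replaced = synonym_replacement(source, 1, rng)  # same length, one token changed
inserted = random_insertion(source, 2, rng)     # two tokens longer
```

Both functions copy their input, so the original sentence stays available alongside its augmented variants.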

artificial intelligence, machine translation, natural language, (14 more...)

2306.07414

Country:

Africa > Southern Africa (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

International conference of the Digital Humanities Association of Southern Africa

VideoLectures.NET, Mar-1-2022, 12:05:32 GMT

The Digital Humanities Association of Southern Africa (DHASA) is organizing its third conference, with the theme “Digitally Human, Artificially Intelligent”. The field of Digital Humanities is still rather underdeveloped in Southern Africa, so this conference has several aims. The first is to bring together researchers who want to showcase their research from the broad field of Digital Humanities. In doing so, the conference provides an overview of the current state of the art in Digital Humanities, especially in the Southern African region. This includes Digital Humanities research by people from Southern Africa and research related to the geographical area of Southern Africa. The DHASA conference is an interdisciplinary platform for researchers working in all areas of Digital Humanities (including, but not limited to, language, literature, visual art, performance and theatre studies, media studies, music, history, sociology, psychology, language technologies, library studies, philosophy, methodologies, and software and computation). It aims to create the conditions for the emergence of a scientific Digital Humanities community of practice.

digital humanity association, international conference, southern africa, (3 more...)

Country: Africa > Southern Africa (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.80)